OcrV1, Main, Exploration, bibRecord, 001420

A video-based framework for the analysis of presentations/posters

Internal identifier: 001420 (Main/Exploration); previous: 001419; next: 001421

Authors: A. Zandifar [United States]; R. Duraiswami [United States]; L. S. Davis [United States]

Source:

International journal on document analysis and recognition (Print) [ISSN 1433-2833]; 2005.

RBID : Pascal:05-0421779

French descriptors

Pascal (Inist)
- Extraction forme, Rectification, Ontologie, Formation image, Reconnaissance forme, Reconnaissance caractère, Donnée textuelle, Séquence image, Traitement image, Interprétation image, Traitement document, Reconnaissance optique caractère, Vision ordinateur.

English descriptors

KwdEn :
- Character recognition, Computer vision, Document processing, Image interpretation, Image processing, Image sequence, Imaging, Ontology, Optical character recognition, Pattern extraction, Pattern recognition, Rectification, Textual data.

Abstract

Detection and recognition of textual information in an image or video sequence is important for many applications. The increased resolution and capabilities of digital cameras and faster mobile processing allow for the development of interesting systems. We present an application based on the capture of information presented at a slide-show presentation or at a poster session. We describe the development of a system to process the textual and graphical information in such presentations. The application integrates video and image processing, document layout understanding, optical character recognition (OCR), and pattern recognition. The digital imaging device captures slides/poster images, and the computing module preprocesses and annotates the content. Various problems related to metric rectification, key-frame extraction, text detection, enhancement, and system integration are addressed. The results are promising for applications such as a mobile text reader for the visually impaired. By using powerful text-processing algorithms, we can extend this framework to other applications, e.g., document and conference archiving, camera-based semantics extraction, and ontology creation.

Affiliations:

Links to previous steps (curation, corpus...)

to stream PascalFrancis, to step Corpus: 000439
to stream PascalFrancis, to step Curation: 000348
to stream PascalFrancis, to step Checkpoint: 000436
to stream Main, to step Merge: 001466
to stream Main, to step Curation: 001420

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">A video-based framework for the analysis of presentations/posters</title>
<author><name sortKey="Zandifar, A" sort="Zandifar, A" uniqKey="Zandifar A" first="A." last="Zandifar">A. Zandifar</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Perceptual Interfaces and Reality Lab (PIRL), University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author><name sortKey="Duraiswami, R" sort="Duraiswami, R" uniqKey="Duraiswami R" first="R." last="Duraiswami">R. Duraiswami</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Perceptual Interfaces and Reality Lab (PIRL), University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author><name sortKey="Davis, L S" sort="Davis, L S" uniqKey="Davis L" first="L. S." last="Davis">L. S. Davis</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Perceptual Interfaces and Reality Lab (PIRL), University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">05-0421779</idno>
<date when="2005">2005</date>
<idno type="stanalyst">PASCAL 05-0421779 INIST</idno>
<idno type="RBID">Pascal:05-0421779</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000439</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000348</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000436</idno>
<idno type="wicri:doubleKey">1433-2833:2005:Zandifar A:a:video:based</idno>
<idno type="wicri:Area/Main/Merge">001466</idno>
<idno type="wicri:Area/Main/Curation">001420</idno>
<idno type="wicri:Area/Main/Exploration">001420</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">A video-based framework for the analysis of presentations/posters</title>
<author><name sortKey="Zandifar, A" sort="Zandifar, A" uniqKey="Zandifar A" first="A." last="Zandifar">A. Zandifar</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Perceptual Interfaces and Reality Lab (PIRL), University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author><name sortKey="Duraiswami, R" sort="Duraiswami, R" uniqKey="Duraiswami R" first="R." last="Duraiswami">R. Duraiswami</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Perceptual Interfaces and Reality Lab (PIRL), University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author><name sortKey="Davis, L S" sort="Davis, L S" uniqKey="Davis L" first="L. S." last="Davis">L. S. Davis</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Perceptual Interfaces and Reality Lab (PIRL), University of Maryland</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint><date when="2005">2005</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Computer vision</term>
<term>Document processing</term>
<term>Image interpretation</term>
<term>Image processing</term>
<term>Image sequence</term>
<term>Imaging</term>
<term>Ontology</term>
<term>Optical character recognition</term>
<term>Pattern extraction</term>
<term>Pattern recognition</term>
<term>Rectification</term>
<term>Textual data</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Extraction forme</term>
<term>Rectification</term>
<term>Ontologie</term>
<term>Formation image</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance caractère</term>
<term>Donnée textuelle</term>
<term>Séquence image</term>
<term>Traitement image</term>
<term>Interprétation image</term>
<term>Traitement document</term>
<term>Reconnaissance optique caractère</term>
<term>Vision ordinateur</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Detection and recognition of textual information in an image or video sequence is important for many applications. The increased resolution and capabilities of digital cameras and faster mobile processing allow for the development of interesting systems. We present an application based on the capture of information presented at a slide-show presentation or at a poster session. We describe the development of a system to process the textual and graphical information in such presentations. The application integrates video and image processing, document layout understanding, optical character recognition (OCR), and pattern recognition. The digital imaging device captures slides/poster images, and the computing module preprocesses and annotates the content. Various problems related to metric rectification, key-frame extraction, text detection, enhancement, and system integration are addressed. The results are promising for applications such as a mobile text reader for the visually impaired. By using powerful text-processing algorithms, we can extend this framework to other applications, e.g., document and conference archiving, camera-based semantics extraction, and ontology creation.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Maryland</li>
</region>
<settlement><li>College Park (Maryland)</li>
</settlement>
<orgName><li>Université du Maryland</li>
</orgName>
</list>
<tree><country name="États-Unis"><region name="Maryland"><name sortKey="Zandifar, A" sort="Zandifar, A" uniqKey="Zandifar A" first="A." last="Zandifar">A. Zandifar</name>
</region>
<name sortKey="Davis, L S" sort="Davis, L S" uniqKey="Davis L" first="L. S." last="Davis">L. S. Davis</name>
<name sortKey="Duraiswami, R" sort="Duraiswami, R" uniqKey="Duraiswami R" first="R." last="Duraiswami">R. Duraiswami</name>
</country>
</tree>
</affiliations>
</record>
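
A record of this shape can be read programmatically. The following is a minimal Python sketch, not part of the Dilib toolchain. It assumes the record is available as a string, and it uses a trimmed one-author sample rather than the full record. Because the record uses the `wicri:` and `inist:` prefixes without declaring them, a strict XML parser rejects it as-is; the namespace URIs bound below are hypothetical placeholders, and the helper name `parse_record` is also an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs (assumption): the record uses the wicri: and
# inist: prefixes without declaring them, so they must be bound before parsing.
NS_DECLS = 'xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist"'

# Trimmed sample in the same shape as the record above (one author only).
SAMPLE = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">A video-based framework for the analysis of presentations/posters</title>
<author><name first="A." last="Zandifar">A. Zandifar</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Perceptual Interfaces and Reality Lab (PIRL), University of Maryland</s1></inist:fA14>
<country>États-Unis</country></affiliation></author>
</titleStmt>
<publicationStmt><idno type="RBID">Pascal:05-0421779</idno><date when="2005">2005</date></publicationStmt>
</fileDesc></teiHeader></TEI></record>"""


def parse_record(xml_text):
    """Return title, authors, year and RBID from a Wicri-style record."""
    # Bind the undeclared prefixes on the root element before parsing.
    patched = xml_text.replace("<record>", "<record " + NS_DECLS + ">", 1)
    root = ET.fromstring(patched)
    # Read authors from titleStmt only; the full record repeats them
    # under sourceDesc/biblStruct/analytic.
    title_stmt = root.find(".//titleStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [n.text for n in title_stmt.findall("author/name")],
        "year": root.find(".//publicationStmt/date").get("when"),
        "rbid": root.findtext(".//publicationStmt/idno[@type='RBID']"),
    }


info = parse_record(SAMPLE)
print(info)
```

Restricting the author lookup to `titleStmt` avoids counting each author twice, since the full record lists the same authors again in the `analytic` element.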

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001420 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001420 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:05-0421779
   |texte=   A video-based framework for the analysis of presentations/posters
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
